Private Information Disclosure from Web Searches. (The case of Google Web History)

نویسندگان

  • Claude Castelluccia
  • Emiliano De Cristofaro
  • Daniele Perito
چکیده

As the amount of personal information stored at remote service providers increases, so does the danger of data theft. When connections to remote services are made in the clear and authenticated sessions are kept using HTTP cookies, data theft becomes extremely easy to achieve. In this paper, we study the architecture of the world’s largest service provider, i.e., Google. First, with the exception of a few services that can only be accessed over HTTPS (e.g., Gmail), we find that many Google services are still vulnerable to simple session hijacking. Next, we present the Historiographer, a novel attack that reconstructs the web search history of Google users, i.e., Google’s Web History, even though such a service is supposedly protected from session hijacking by a stricter access control policy. The Historiographer uses a reconstruction technique inferring search history from the personalized suggestions fed by the Google search engine. We validate our technique through experiments conducted over real network traffic and discuss possible countermeasures. Our attacks are general and not only specific to Google, and highlight privacy concerns of mixed architectures using both secure and insecure connections. Update: Our report was sent to Google on February 23rd, 2010. Google is investigating the problem and has decided to temporarily suspend search suggestions from Search History. Furthermore, Google Web History page is now offered over HTTPS only. Updated information about this project is available at: http://planete.inrialpes.fr/projects/private-information-disclosure-from-web-searches

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Private Information Disclosure from Web Searches

As the amount of personal information stored at remote service providers increases, so does the danger of data theft. When connections to remote services are made in the clear and authenticated sessions are kept using HTTP cookies, data theft becomes extremely easy to achieve. In this paper, we study the architecture of the world’s largest service provider, i.e., Google. First, with the excepti...

متن کامل

Towards Enforceable Data-Driven Privacy Policies

A defining characteristic of current web applications is that they are personalized according to the interests and preferences of individual users; popular examples are Google News and Amazon.com. While this paradigm shift is generally viewed as positive by both users and content providers, it introduces privacy concerns, as the data needed to drive this functionality is often considered privat...

متن کامل

Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore

Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...

متن کامل

Toward a Privacy Agent for Information Retrieval

In this paper, we tackle the private information retrieval (PIR) problem associated with the use of Internet search engines. We address the desire for a user to retrieve information from the Web without the search provider learning about it. Traditional PIR protocols present two main shortcomings for their application: (i) They assume cooperation by the database, which is not affordable for a r...

متن کامل

Google Web 1T 5-Grams Made Easy (but not for the computer)

This paper introduces Web1T5-Easy, a simple indexing solution that allows interactive searches of the Web 1T 5-gram database and a derived database of quasi-collocations. The latter is validated against co-occurrence data from the BNC and ukWaC on the automatic identification of non-compositional VPC.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1003.3242  شماره 

صفحات  -

تاریخ انتشار 2010